Adaptive Asynchronous Control Using Meta-learned Neural Ordinary Differential Equations
Model-based Reinforcement Learning and Control have demonstrated great
potential in various sequential decision-making problem domains, including in
robotics settings. However, real-world robotic systems often present
challenges that limit the applicability of those methods. In particular, we
note two problems that jointly occur in many industrial systems: 1)
irregular/asynchronous observations and actions, and 2) dramatic changes in
environment dynamics from one episode to another (e.g. varying payload inertial
properties). We propose a general framework that overcomes those difficulties
by meta-learning adaptive dynamics models for continuous-time prediction and
control. The proposed approach is task-agnostic and can be adapted to new tasks
in a straightforward manner. We present evaluations in two different robot
simulations and on a real industrial robot.
Comment: 16 double-column pages, 14 figures, 3 tables
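The core mechanism here is continuous-time prediction: a learned dynamics function dx/dt = f(x, t) can be integrated to arbitrary, irregularly spaced observation times, which a fixed-step discrete-time model cannot do. A minimal sketch of that idea, with a simple analytic decay function standing in for the paper's meta-learned neural dynamics model (the function, time points, and step counts below are illustrative assumptions, not taken from the paper):

```python
def rk4_step(f, x, t, dt):
    """One Runge-Kutta-4 step of dx/dt = f(x, t)."""
    k1 = f(x, t)
    k2 = f(x + 0.5 * dt * k1, t + 0.5 * dt)
    k3 = f(x + 0.5 * dt * k2, t + 0.5 * dt)
    k4 = f(x + dt * k3, t + dt)
    return x + (dt / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)

def predict(f, x0, times, substeps=10):
    """Integrate the dynamics forward to each (possibly irregular)
    observation time, returning the predicted state at each one."""
    preds, x, t = [], x0, times[0]
    for t_next in times[1:]:
        dt = (t_next - t) / substeps
        for _ in range(substeps):
            x = rk4_step(f, x, t, dt)
            t += dt
        preds.append(x)
    return preds

# Hypothetical stand-in for a meta-learned dynamics network: dx/dt = -0.5 x,
# whose exact solution is x(t) = exp(-0.5 t).
dynamics = lambda x, t: -0.5 * x

# Irregularly spaced observation times, as in asynchronous control.
times = [0.0, 0.13, 0.5, 0.51, 1.2]
print(predict(dynamics, 1.0, times))
```

Because the integrator takes the gap between consecutive timestamps as input, the same learned model serves both dense and sparse observation streams; the paper's contribution is meta-learning f so it adapts quickly to changed dynamics.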
Behavioral Repertoire via Generative Adversarial Policy Networks
Learning algorithms are enabling robots to solve increasingly challenging
real-world tasks. These approaches often rely on demonstrations and reproduce
the behavior shown. Unexpected changes in the environment may require using
different behaviors to achieve the same effect, for instance to reach and grasp
an object in changing clutter. An emerging paradigm addressing this robustness
issue is to learn a diverse set of successful behaviors for a given task, from
which a robot can select the most suitable policy when faced with a new
environment. In this paper, we explore a novel realization of this vision by
learning a generative model over policies. Rather than learning a single
policy, or a small fixed repertoire, our generative model for policies
compactly encodes an unbounded number of policies and allows novel controller
variants to be sampled. Leveraging our generative policy network, a robot can
sample novel behaviors until it finds one that works for a new environment. We
demonstrate this idea with an application of robust ball-throwing in the
presence of obstacles. We show that this approach achieves a greater diversity
of behaviors than an existing evolutionary approach, while maintaining good
efficacy of sampled behaviors, allowing a Baxter robot to hit targets more
often when throwing balls in the presence of obstacles.
Comment: In Proceedings of 2019 Joint IEEE 9th International Conference on
Development and Learning and Epigenetic Robotics (ICDL-EpiRob), pages 320 - 32
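The sampling loop the abstract describes, in which a robot draws latent vectors, decodes each into a policy, and keeps trying until one works in the new environment, can be sketched as follows. The decoder and the ball-throwing physics check below are toy stand-ins for the paper's trained generative policy network and real environment; all constants are illustrative assumptions:

```python
import math
import random

def decoder(z):
    """Hypothetical stand-in for a trained generative policy network:
    decodes a 2-D latent vector into throw parameters (angle, speed)."""
    angle = 0.2 + 0.6 * z[0]   # launch angle in radians
    speed = 4.0 + 2.0 * z[1]   # launch speed in m/s
    return angle, speed

def succeeds(policy, obstacle_height):
    """Toy environment check: does the ball clear an obstacle 1 m away?"""
    angle, speed = policy
    vx, vy = speed * math.cos(angle), speed * math.sin(angle)
    if vx <= 0.1:              # throw never reaches the obstacle
        return False
    t = 1.0 / vx               # time of flight to x = 1 m
    height = vy * t - 0.5 * 9.81 * t * t
    return height > obstacle_height

def sample_until_success(obstacle_height, max_tries=1000, seed=0):
    """Sample novel policies from the generative model until one works."""
    rng = random.Random(seed)
    for _ in range(max_tries):
        z = [rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)]
        policy = decoder(z)
        if succeeds(policy, obstacle_height):
            return policy
    return None

policy = sample_until_success(obstacle_height=0.3)
print(policy)
```

The point of the generative formulation is visible even in this sketch: because the latent space is continuous, the robot is not restricted to a finite stored repertoire and can keep drawing fresh controller variants until one suits the current obstacle configuration.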
Discovering Representations for Black-box Optimization
The encoding of solutions in black-box optimization is a delicate,
handcrafted balance between expressiveness and domain knowledge -- between
exploring a wide variety of solutions, and ensuring that those solutions are
useful. Our main insight is that this process can be automated by generating a
dataset of high-performing solutions with a quality diversity algorithm (here,
MAP-Elites), then learning a representation with a generative model (here, a
Variational Autoencoder) from that dataset. Our second insight is that this
representation can be used to scale quality diversity optimization to higher
dimensions -- but only if we carefully mix solutions generated with the learned
representation and those generated with traditional variation operators. We
demonstrate these capabilities by learning a low-dimensional encoding for the
inverse kinematics of a thousand-joint planar arm. The results show that
learned representations make it possible to solve high-dimensional problems
with orders of magnitude fewer evaluations than the standard MAP-Elites, and
that, once solved, the produced encoding can be used for rapid optimization of
novel, but similar, tasks. The presented techniques not only scale up quality
diversity algorithms to high dimensions, but show that black-box optimization
encodings can be automatically learned, rather than hand-designed.
Comment: Presented at GECCO 2020 -- v2 (previous title 'Automating
Representation Discovery with MAP-Elites')
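The pipeline rests on the basic MAP-Elites loop: maintain an archive of cells indexed by behavior descriptor, and keep the best-performing solution found per cell. A minimal sketch of that loop on a toy problem (the fitness function, descriptor, and all constants below are illustrative assumptions, and the paper's second stage of training a VAE on the resulting archive is omitted):

```python
import random

def evaluate(x):
    """Toy domain: fitness favors small parameter norm; the behavior
    descriptor is the (clipped) first two parameters."""
    fitness = -sum(v * v for v in x)
    desc = tuple(min(max(v, -1.0), 1.0) for v in x[:2])
    return fitness, desc

def to_cell(desc, bins=10):
    """Map a descriptor in [-1, 1]^2 to a discrete archive cell."""
    return tuple(min(int((d + 1.0) / 2.0 * bins), bins - 1) for d in desc)

def map_elites(dim=5, iters=5000, seed=0):
    rng = random.Random(seed)
    archive = {}  # cell -> (fitness, solution)
    for _ in range(iters):
        if archive and rng.random() < 0.9:
            # traditional variation operator: Gaussian mutation of an elite
            parent = rng.choice(list(archive.values()))[1]
            x = [v + rng.gauss(0.0, 0.2) for v in parent]
        else:
            # random restart to seed the archive
            x = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        fitness, desc = evaluate(x)
        cell = to_cell(desc)
        if cell not in archive or fitness > archive[cell][0]:
            archive[cell] = (fitness, x)
    return archive

archive = map_elites()
print(len(archive), "cells filled out of 100")
```

In the paper's scheme, the elites collected in such an archive become the training set for a Variational Autoencoder, and decoded latent samples are then mixed with the Gaussian mutations above as an additional variation operator.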
DREAM Architecture: a Developmental Approach to Open-Ended Learning in Robotics
Robots are still limited to controlled conditions that the robot designer
knows in enough detail to endow the robot with the appropriate models or
behaviors. Learning algorithms add some flexibility: they can discover the
appropriate behavior given either demonstrations or a reward that guides
exploration with a reinforcement learning algorithm. Reinforcement
learning algorithms rely on the definition of state and action spaces that
define reachable behaviors. Their adaptation capability critically depends on
the representations of these spaces: small and discrete spaces result in fast
learning while large and continuous spaces are challenging and either require a
long training period or prevent the robot from converging to an appropriate
behavior. Besides the operational cycle of policy execution and the learning
cycle, which works at a slower time scale to acquire new policies, we introduce
the redescription cycle, a third cycle working at an even slower time scale to
generate or adapt the required representations to the robot, its environment
and the task. We introduce the challenges raised by this cycle and we present
DREAM (Deferred Restructuring of Experience in Autonomous Machines), a
developmental cognitive architecture to bootstrap this redescription process
stage by stage, build new state representations with appropriate motivations,
and transfer the acquired knowledge across domains or tasks or even across
robots. We describe the results obtained so far with this approach and
conclude with a discussion of the questions it raises in neuroscience.
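The three nested time scales the abstract distinguishes can be illustrated with a deliberately tiny example in which the slowest cycle chooses a state representation (here, just a discretization granularity), the middle cycle learns a tabular policy under that representation, and the fastest cycle executes it. Everything below is an illustrative toy, not the DREAM implementation:

```python
import random

def redescription_cycle(n_bins):
    """Slowest cycle: propose a (re)description of the state space;
    here, simply a 1-D discretization granularity."""
    def describe(x):           # continuous state in [0, 1) -> discrete state
        return min(int(x * n_bins), n_bins - 1)
    return describe

def learning_cycle(describe, episodes=200, seed=0):
    """Middle cycle: tabular learning of which action pays off per state.
    Toy task: action 0 is correct below 0.5, action 1 above."""
    rng = random.Random(seed)
    q = {}
    for _ in range(episodes):
        x = rng.random()
        s = describe(x)
        a = rng.randint(0, 1)
        target = 1 if x >= 0.5 else 0
        reward = 1.0 if a == target else 0.0
        q[(s, a)] = q.get((s, a), 0.0) + 0.1 * (reward - q.get((s, a), 0.0))
    return q

def operational_cycle(q, describe, x):
    """Fastest cycle: greedy policy execution with the learned table."""
    s = describe(x)
    return max((0, 1), key=lambda a: q.get((s, a), 0.0))

describe = redescription_cycle(2)   # a representation with two states
q = learning_cycle(describe)
print(operational_cycle(q, describe, 0.9))
```

The example shows why the redescription cycle matters: with two bins the task is learnable in a handful of episodes, whereas a representation that lumps both regimes into one state would make the middle cycle fail no matter how long it trains.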